Forecasting and change point test for nonlinear heteroscedastic time series based on support vector regression

SVR-ARMA-GARCH models provide flexible model fitting and good predictive powers for nonlinear heteroscedastic time series datasets. In this study, we explore the change point detection problem in the SVR-ARMA-GARCH model using the residual-based CUSUM test. For this task, we propose an alternating recursive estimation (ARE) method to improve the estimation accuracy of residuals. Moreover, we suggest using a new testing method with a time-varying control limit that significantly improves the detection power of the CUSUM test. Our numerical analysis exhibits the merits of the proposed methods in SVR-ARMA-GARCH models. A real data example is also conducted using BDI data for illustration, which also confirms the validity of our methods.


Introduction
Non-linearity and conditional heteroscedasticity are two major characteristics of financial time series data. The SVR-ARMA-GARCH models proposed by [1] provide flexible model fitting and good predictive ability for data with these two characteristics. However, major events such as the financial tsunami and the COVID-19 pandemic often bring shocks and changes to the financial market, making the original models no longer suitable for future decisions. Therefore, it is important to detect the occurrence of change points in time to update the model. The objective of this paper is twofold: the first is to improve the forecasting capability for more accurate parameter/residual estimation for the SVR-ARMA-GARCH time series, and the second is to enhance the detection power of the residual-based CUSUM test of [2] when changes occur near the current observing time.
For the first, we propose an alternating recursive estimation (ARE) method which estimates jointly the parameters of the conditional mean and conditional variance equations of SVR-AR-MA-GARCH models. Our numerical studies show that the proposed ARE method outperforms [1] method in terms of forecasting performance and thereby improves the accuracy of parameter/residual estimation in SVR-ARMA-GARCH models. This improvement in the residual estimation also greatly enhances the detection ability of a change point. For the second, we aim to improve the residual-based CUSUM test proposed by [2]. Their test only uses time series observations and model-based residuals, so that it has merits to be simpler and more flexible than other types of CUSUM tests. We refer to [3] for a theoretical background and history of CUSUM tests for time series, particularly, the ones based on residuals. Although [2] test performs adequately in many situations, T max has a shortcoming to yield low powers when a change is near current observing time, which results in a late detection. To resolve this problem, we here propose an alternative testing procedure by supplying a timevarying upper control limit (UCL) for T s , wherein the underlying basic process asymptotically behaves like a standardized (by its time point) Brownian bridge, and additionally, the test statistic T scale max of [4] whose limiting process is the supremum of a Brownian bridge standardized by its maximizing point.
Our numerical analysis reveals that those three tests have their own merits of yielding high powers in different stages of time period, namely, T scale max , T max , and T s with UCL (T UCL s ) perform the best for detecting a change in an early, middle and late stage, respectively. Since the CUSUM test under consideration is retrospective, the late stage of time period implicates the time close to the currently observing time, and therefore, the last testing procedure is conducive to quickly detecting a change point near the current time and makes a more suitable test for a dynamic monitoring scheme and timely update of models particularly when a moving window scheme is adopted.
The organization of this paper is as follows. Section 2 introduces the proposed alternating recursive estimation (ARE) method for the SVR-ARMA-GARCH model. Section 3 introduces the aforementioned three residual-based CUSUM tests. Simulation and empirical studies are conducted in Sections 4 and 5 for illustration. Section 6 provides the conclusions.

ARE of the SVR-ARMA-GARCH models
Time series forecasting is crucial to forecast the characteristics of time series and detect anomalies in statistical monitoring process. Conventionally, linear ARMA models have been used for this purpose, but as time series often has significant nonlinear features, the forecasting result based on the ARMA models is inaccurate and hard to utilize for the application to monitoring processes. As an alternative, researchers can consider using support vector regression (SVR). SVR originates from [5,6] statistical learning theory, which uses nonlinear functions to convert input variables into a higher dimensional space to helps explore information that can only be observed in higher dimensional space. Under the varieties of circumstances, it provides flexibility and excellent forecasting accuracy, and satisfies the structural risk minimization principle, see [7][8][9]. This far, the SVR has been applied to various practical problems; for example [10], used it for stock prediction and [11] used it for anomaly detection.
Chen et al. [1] adopted the SVR method to estimate the parameters in SVR-based nonlinear GARCH models and showed that the SVR-GARCH models significantly outperform classical parametric models in the respect of one-period-ahead volatility forecasting. In this study, we aim to improve and refine their method as addressed below.
Let r t denote an asset return at time t. The r t is said to follow an ARMA(p, q)-GARCH(m, s) model if where Z t ¼ a 2 t À s 2 t is a martingale difference sequence and s 2 t ¼ Varða t jF tÀ 1 Þ: It is understood that α i = 0 for i > m and β j = 0 for j > s.
The SVR-ARMA-GARCH model replaces the linear functions in Eqs (1) and (2) with the nonlinear functions h(�) and g(�) and is expressed as: where hðr t;p ; a t;q Þ ¼ w T h � h ðr t;p ; a t;q Þ; The functions ϕ h (�) and ϕ g (�) are defined using a Gaussian kernel, i.e.
Chen et al. [1] proposed to estimate w h and w g separately by optimizing the two objective functions in Eqs (5) and (6) below. Namely, Eq (5) is optimized to obtain w h and the residual a t ¼ r t À hðr t;p ;â t;q Þ, andâ t is used to optimize Eq (6) for obtaining w g : and min w g However, this method does not consider the impact of volatility on the returns when estimating w h . As such, we replace the residuals in Eq (5) with the standardized residuals in Eq (7) to estimate w h as follows: To optimize Eqs (6) and (7), we adopt an alternating recursive estimation (ARE) method illustrated in Fig 1. Namely, at each iteration, we fix w g to update w h , then fix w h to update w g , and continue to alternate this estimation procedure until the two residual sequences sufficiently converge to each other. This method performs more functionally in computing the predicted values and residuals, used for detecting a change point as described in Section 4 below, wherein the predominance of the ARE method is demonstrated empirically through Monte Carlo simulations.

Residual-based CUSUM test
This section introduces three different test statistics to detect a change point.
T max of Lee et al. [2]. Let us consider the location-scale time series model of the form: y t ¼ h t ðy tÀ 1 ; y tÀ 2 ; :::Þ þ g 1=2 t ðy tÀ 1 ; y tÀ 2 ; :: where η t are i.i.d. random variables with mean 0 and a fourth moment. The h t (�) and g t (�) indicate the conditional mean and variance of y t at time t. A structural change occurs if h t (�) and g t (�) experiences a change. Given observations y 1 , . . ., y n , we set up the null and alternative hypotheses: H 0 : h t ð�Þ and g t ð�Þ remain the same for the whole period: vs: H 1 : not H 0 : To test these hypotheses [2], used the test: According to [13], Hence, for a given significance level α, we can find a critical value K from Eq (9) satisfying In particular, when α = 0.05, the corresponding K is 1.358. When T max is greater than the critical value K, we reject the null hypothesis.
Ferger [4] obtained the limiting distribution of T scale max and used it to improve the power of the Kolmogorov-Smirnov test. More specifically [4], proved that the c.d.f. of R: 1Þ jB s j ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi has a form of for x � 0 and α i = 2i + 1, where F and ϕ denote the distribution function and its density of a standard normal random variable. For a given significance level α, the null hypothesis is rejected if T scale max is greater than the critical value F À 1 R ðaÞ. In particular, using Eq (13), we obtain F À 1 R ð0:05Þ ¼ 2:795 and F À 1 R ð0:10Þ ¼ 2:5. T s with a time-varying control limit. Although T scale max can overcome a shortcoming of T max , it only considers the standardized T max but not the standardized values at other time points. As such, we here construct the time-varying control limits for T s based on the fact: where A n ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi 2 log log n p and D n ¼ 2 log log n þ 1 2 log log log n À 1 2 log p; refer to Corollary A.3.1 of [14]. For example, when α = 0.05 and n = 500, we have This result enables us to set up an adjusted control limit for improvement, but for small s, there can be a concern regarding excessive type 1 errors in its implementation. For example, when the AR (1) model below considered, where a t are i.i. d. N(0, 1) errors, the test based on 500 observations from this model shows that the rejection ratio of the null hypothesis out of 500 replications is 0.114 which is much larger than the significance level of α = 0.05.
In Fig 4, the x-axis represents the time points where the type 1 error occurs, and the y-axis represents the number of occurrences. The figure indicates that the control limit based on Eq (14) cannot perform well in an early stage.
To overcome this problem, we propose to use the control limit for T s , s 2 (0, 1), as follows: Then, we reject the null hypothesis when T s > UCL(s) for some s 2 (0, 1). We name this testing method T UCL s . For a given significance level α, the upper control limit can be obtained via finding K, K � , and s � in Eq (16) through the following steps: 1. Find K with P(sup|B s | > K) = α.
2. Simulate 100,000 standard Brownian bridge paths of sample size n.  We further investigate the stability of the critical value K � vs. the sample size n. Fig 7 shows that when the sample size n < 400, the value of K � increases as n grows gradually, but when 400 � n � 1000, the value of K � looks stable.

Numerical analysis
We evaluate the performance of the SVR-ARMA-GARCH model using the proposed ARE method and the residual-based CUSUM test based on T s . For this task, we generate the data following a Lorenz system, namely, y t , t = 1, 2, . . ., n, is from the following model: where β t and γ t originate from the Lorenz system, as shown in Eq (19). Under the null of no changes, we consider ϕ = 1. The Lorentz system is a three-dimensional dynamic system, obtained from the convective volume equation in the atmospheric equation: For each simulation, 1100 observations are simulated, and the first 100 are removed. The first 500 observations are used as a training set to fit the three models: ARMA-GARCH, SVR-ARMA-GARCH based on the ARE and the method of [1], and the last 500 observations are used as a test set for prediction. The last 200 observations in the training set are used as a validation set to decide the free parameters of the SVR-ARMA-GARCH model. Fig 8 shows a time series realization simulated by Lorenz system and the partition of training, validation and testing sets. We use rolling window to evaluate the 500 one-step-ahead forecasts of the fitted three models respectively in the testing period.
Evaluation metrics. We use five evaluation metrics to measure the prediction accuracy. Let y t denote the series of the returns to predict, andâ t ¼ y t Àŷ t denote the estimated residuals.
• Evaluation metrics for conditional mean predictorŷ i (i). NMSE: Normalized mean square error based on y i , (ii). NMSE_COND: Normalized mean square error based on the conditional mean of y i , (iii). Sign error:  Table 1 shows the averaged performance for each evaluation metric after 500 repetitions. As the data is generated from the nonlinear Lorenz system, the performance of the ARMA-GARCH model is shown to be flawed. A pairwise t-test demonstrates that the ARE method performs significantly better than that of [1].
Next, we evaluate the performance of the three tests: T UCL s , T max , and T scale max , using the SVR-ARMA-GARCH and ARE method to fit it into the dataset and estimate the residuals. Each simulation is conducted with the samples of size 500 and the significance level α = 0.05. Then, under the null hypothesis, the ratios of rejection of the three tests appear to be 0.03, 0.038, and 0.044, respectively, which are all close to α = 0.05. In Fig 9, the x-axis represents the time points where the type 1 error occurs, and the y-axis represents the number of occurrences in five hundred experiments.
Finally, we investigate the power of the three tests by changing the value of the parameter ϕ in the test set at different time points and compute the proportion of rejecting the null hypothesis. Fig 10 presents the powers of the three tests at different locations. The left side presents the changed parameter. The x-axis stands for the position where the change point occurs, while the y-axis presents the rejection rate of the null hypothesis. The results exhibit that when the change point appears in an early period, the power of T scale max is 20% higher than that of T max , and in contrast, when the change point appears in a later stage, the power of T UCL s is 60% higher than that of T max .

Empirical study of the BDI data
Data description. The Baltic Dry Index (BDI) is issued daily by the Baltic Exchange in London. The BDI is a composite of the Capesize, Panamax, and Supramax Timecharter Averages. This is reported worldwide as a proxy for dry bulk shipping stocks and a general shipping market bellwether. We use the BDI daily log return data to perform a one-step-ahead forecast. We split the data into the training set from 2015/01/01 to 2016/12/31 and the test set from 2019/01/01 to 2021/04/06 concurrent with the validation set from 2017/01/01 to 2018/12/31. See Fig 11. Prediction model. First, we determine the free parameters in the SVR-ARMA-GARCH model. The orders of the ARMA model and GARCH model are determined by extended autocorrelation function (EACF) and AIC. The free parameters of the SVR model are determined and then an ARMA(2,2)-GARCH(1,1) model that minimizes AIC: We then apply a grid search to the validation set to determine the free parameters of SVR. Parameters γ h , γ g , and � g are selected from {0.01, 0.02, . . ., 0.1}, and parameter � h is selected from {0.1, 0.2, . . ., 1}. Also, parameters C h and C g are selected from {0.5, 0.6, . . ., 1.5}.
For comparison, we also consider three neural network methods: RNN, LSTM, and GRU, with two hidden layers. Here, the number of hidden layer neurons in the network is selected by a grid search from {2, 4, 8, . . ., 256}. We also build a neural network model for theâ 2 t sequence to predict the conditional variance. For measuring the accuracy of prediction, we use metrics: NMES, Sign error, and RMSE_var, wherein the number of repetition is 100. Table 2 shows that the neural network model has better predictive ability than the traditional time series models (ARMA and ARMA-GARCH), but underperforms the SVR-ARMA-GARCH model, and further, the ARE method performs better in the SVR-ARMA-GARCH model.
Change point detection. To perform a residual-based CUSUM test, we use the residuals obtained based on the ARE method and apply T UCL s , T max , and T scale max to the BDI data to detect a change point from 2019/01/01 to 2021/04/06. In this task, we use the rolling window scheme with the width of one year and one day movement. See Fig 12. When a change point is detected, the point exceeding the control limit the most is declared as a change point.  One can guess that a true change point occurs on 2019/12/24 as the COVID-19 outbreak began at that time and triggered a sudden change in the global financial market resulting in a plunge of crash in stock prices. To examine this, the three tests are conducted. The result shows that T UCL s detects a change point most quickly on 2020/01/10, while both T max and T scale max detect the change point on 2020/06/11, which is quite late compared to the first test. See Fig  13. The second change point is claimed to be on 2020/03/30 as the price of Brent crude oil fell 9% at the date to US $23 per barrel which was the lowest price since 2002. As all ships in BDI needed oil to travel, the abnormality of Brent crude oil led to trigger a serious BDI change. In this case, T UCL s , T scale max , and T max appear to detect a change point on 2020/04/14, 2020/05/15, and 2020/06/11, respectively. See Fig 14. This finding reveals that T UCL s can detect a change about a half month after the occurrence of such big events, a lot faster than T max and T scale max , which strongly affirms the validity of our proposed method when used in tandem with the rolling window scheme.

Conclusion
In this study, we proposed an alternating recursive estimation (ARE) for the SVR-ARMA-GARCH model. Our numerical study showed that our one-step-ahead prediction method has better predictive ability and supplies more accurate residual estimation than the method of [1]. We also proposed a new residual-based CUSUM test T UCL s employing a time-varying control limit. We demonstrated via Monte Carlo simulations that T UCL s , T max of [2] and T scale max of [4] have their own advantages of producing higher powers in different time periods. Our findings in the empirical study using the daily BDI (Baltic Dry Index) data demonstrate the superiority of T UCL s over the others in terms of quick detection ability. All these results affirmed the validity of the proposed method.